Abstract: We propose a novel statistical method for detecting objects
in noisy images. The method uses results from percolation and random
graph theories. We present an algorithm that detects objects of unknown
shape in the presence of nonparametric noise of unknown level. The noise
density is assumed to be unknown and can be very irregular. Our procedure
differs substantially from wavelet-based algorithms. The algorithm has
linear complexity and exponential accuracy and is appropriate for
real-time systems. We prove results on the consistency and algorithmic
complexity of our procedure.

Keywords and phrases: Image analysis, signal detection, image
reconstruction, percolation, noisy image, unknown noise.



1. Introduction 

Assume we observe a noisy digital image on a screen of N x N pixels. Object 
detection and image reconstruction for noisy images are two of the corner- 
stone problems in image analysis. In this paper, we propose a new efficient 
technique for quick detection of objects in noisy images. Our approach uses 
mathematical percolation theory. 

Detection of objects in noisy images is the most basic problem of image
analysis. Indeed, when one looks at a noisy image, the first question to ask is
whether there is any object at all. This is also a primary question of interest
in such diverse fields as, for example, cancer detection (Ricci-Vitiani et al.
(2007)), automated urban analysis (Negri et al. (2006)), detection of cracks
in buried pipes (Sinha and Fieguth (2006)), and other possible applications
in astronomy, electron microscopy and neurology. Moreover, if there is just
random noise in the picture, it doesn't make sense to run computationally
intensive procedures for image reconstruction for this particular picture.
Surprisingly, the vast majority of image analysis methods, both in statistics
and in engineering, skip this stage and start immediately with image
reconstruction.

The crucial difference of our method is that we do not impose any shape 
or smoothness assumptions on the boundary of the object. This permits the 
detection of nonsmooth, irregular or disconnected objects in noisy images, 
under very mild assumptions on the object's interior. This is especially suit- 
able, for example, if one has to detect a highly irregular non-convex object in 
a noisy image. This is usually the case, for example, in the aforementioned 
fields of automated urban analysis, cancer detection and detection of cracks 
in materials. Although our detection procedure works for regular images as 
well, it is precisely the class of irregular images with unknown shape where 
our method can be very advantageous. 

Many modern methods of object detection, especially the ones used by
practitioners in medical image analysis, require at least a preliminary
reconstruction of the image before an object can be detected. This usually
makes such methods difficult to analyse rigorously in terms of performance
and error control. Our approach is free from this drawback.
Even though some papers work with a similar setup (see Arias-Castro et al.
(2005)), both our approach and our results differ substantially from this and
other studies of the subject. We also do not use any wavelet-based techniques
in the present paper.

We view the object detection problem as a nonparametric hypothesis
testing problem within the class of discrete statistical inverse problems.
We assume that the noise density is completely unknown, and that it is not
necessarily smooth or even continuous. It is even possible that the noise
distribution doesn't have a density.

In this paper, we propose an algorithmic solution for this nonparametric
hypothesis testing problem. We prove that our algorithm has linear
complexity in terms of the number of pixels on the screen, and that this
procedure is not only asymptotically consistent, but also has accuracy that
grows exponentially with the "number of pixels" in the object of detection.
The algorithm has a built-in data-driven stopping rule, so there is no need
for human assistance to stop the algorithm at an appropriate step.

In this paper, we assume that the original image is black-and-white and
that the noisy image is greyscale. While our focus on greyscale images
could have been a serious limitation in the case of image reconstruction, it
essentially does not affect the scope of applications in the case of object
detection. Indeed, in the vast majority of problems, an object that has to be
detected either has (on the picture under analysis) a colour that differs from
the background colours (for example, in road detection), or has the same
colour but of a very different intensity, or at least has a relatively thick
boundary that differs in colour from the background. Moreover, in practical
applications one often has some prior information about the colours of both
the object of interest and the background. When this is the case, the method
of the present paper is applicable after a simple rescaling of colour values.

The paper is organized as follows. Our statistical model is described in
detail in Section 2. Suitable thresholding for noisy images is crucial in our
method and is developed in Section 3. A new algorithm for object detection
is presented in Section 4. Theorem 1 is the main result about consistency and
computational complexity of our testing procedure. An example illustrating
possible applications of our method is given in Section 5. Section 6 is devoted
to the proof of the main theorem.

2. Statistical model 

Suppose we have a two-dimensional image. For numerical or graphical
processing of images on computers, the image always has to be discretized.
This is achieved via a certain pixelization procedure. In our setup, we will
be working with images that are already discrete.

In the present paper we are interested in detection of objects that have a 
known colour. This colour has to be different from the colour of the back- 
ground. Mathematically, this is equivalent to assuming that the true (non- 
noisy) images are black-and-white, where the object of interest is black and 
the background is white. 

In other words, we are free to assume that all the pixels that belong to the
meaningful object within the digitized image have the value 1 attached to
them. We can call this value a black colour. Additionally, assume that the
value 0 is attached to those and only those pixels that do not belong to the
object in the non-noisy image. If the number 0 is attached to the pixel, we
call this pixel white.

In this paper we always assume that we observe a noisy image. The
observed values on pixels could be different from 0 and 1, so we will
typically have a greyscale image at the beginning of our analysis. It is also
assumed that on each pixel we have random noise that has the unknown
distribution function F; the noise at each pixel is completely independent of
the noise on other pixels.

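Let us formulate the model more formally. We have an $N \times N$ array of
observations, i.e. we observe $N^2$ real numbers $\{Y_{ij}\}_{i,j=1}^{N}$.
Denote the true value on the pixel $(i, j)$, $1 \le i, j \le N$, by
$\mathrm{Im}_{ij}$, and the corresponding noise by $\sigma \varepsilon_{ij}$.
According to the above,

$$Y_{ij} = \mathrm{Im}_{ij} + \sigma \varepsilon_{ij}, \tag{1}$$

where $1 \le i, j \le N$, and $\{\varepsilon_{ij}\}$, $1 \le i, j \le N$, are
i.i.d., and

$$\mathrm{Im}_{ij} = \begin{cases} 1, & \text{if } (i, j) \text{ belongs to the object;} \\ 0, & \text{if } (i, j) \text{ does not belong to the object.} \end{cases} \tag{2}$$

To stress the dependence on the noise level $\sigma$, we write our assumption
on the noise in the following way:

$$\varepsilon_{ij} \sim F, \quad E\,\varepsilon_{ij} = 0, \quad \mathrm{Var}\,\varepsilon_{ij} = 1. \tag{3}$$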

The null hypothesis is $H_0 : \mathrm{Im}_{ij} = 0$ for all $i, j$. The
alternative hypothesis is $H_1 : \mathrm{Im}_{ij} \neq 0$ for some $i, j$. It
is important that we consider the case of fully nonparametric noise of
unknown level and with an unknown distribution. In principle, for
applications of our method the noise doesn't need to be symmetric, and it is
not necessary that the noise has mean 0 and finite variance. Under certain
restrictions, our testing procedure is consistent in these more difficult
situations as well; see Langovoy and Wittich (2009a) for an example.

Now we can proceed to preliminary quantitative estimates. If a pixel $(i, j)$
is white in the original image, let us denote the corresponding probability
distribution of $Y_{ij}$ by $P_0$. For a black pixel $(i, j)$ we denote the
corresponding distribution of $Y_{ij}$ by $P_1$. We are free to omit the
dependence of $P_0$ and $P_1$ on $i$ and $j$ in our notation, since all the
noises are independent and identically distributed.

Let $F$ denote the common distribution function of the $\varepsilon_{ij}$'s.
Throughout this paper we will additionally assume that the following
non-degeneracy assumption holds.

$$\text{(A)} \quad F(t) = C \ \text{for all } t \in (a, b) \ \Longrightarrow \ b - a < 1. \tag{4}$$

Proposition 1. If (A) holds and the distribution of the noise is symmetric,
then

$$P_0(Y_{ij} \ge 1/2) < 1/2, \tag{5}$$
$$1/2 < P_1(Y_{ij} \ge 1/2). \tag{6}$$

Proof. (Proposition 1) Since the noise is symmetric, assumption (A) yields

$$P_1(Y_{ij} \ge 1/2) = P(\varepsilon + 1 \ge 1/2) = P(\varepsilon \ge -1/2) = P(\varepsilon \le 1/2) > P(\varepsilon \le 0) = 1/2.$$

For the other part, we have in view of the previous calculation

$$P_0(Y_{ij} \ge 1/2) = P(\varepsilon \ge 1/2) = 1 - P(\varepsilon < 1/2) < 1/2.$$

This completes the proof. □



This simple observation is crucial for the present paper. 
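
As a numerical illustration of Proposition 1, the following minimal sketch evaluates both probabilities exactly for one particular symmetric noise law. The choice of the Cauchy distribution (which has neither mean nor variance, so it goes beyond assumption (3)) and the function names `cauchy_cdf` and `exceed_probs` are assumptions made for this example only; the scale parameter plays the role of the noise level $\sigma$.

```python
import math

def cauchy_cdf(t: float) -> float:
    """Distribution function of the standard Cauchy law (symmetric about 0)."""
    return 0.5 + math.atan(t) / math.pi

def exceed_probs(sigma: float, theta: float = 0.5):
    """Return (P0, P1) for the model Y = Im + sigma * eps.

    P0 is the probability that a white pixel (Im = 0) is observed above theta,
    P1 is the probability that a black pixel (Im = 1) is observed above theta.
    """
    p0 = 1.0 - cauchy_cdf((theta - 0.0) / sigma)
    p1 = 1.0 - cauchy_cdf((theta - 1.0) / sigma)
    return p0, p1

for sigma in (0.1, 1.0, 1.8, 10.0):
    p0, p1 = exceed_probs(sigma)
    print(f"sigma={sigma}: P0={p0:.4f} < 1/2 < P1={p1:.4f}")
```

For every noise level the two probabilities stay on opposite sides of 1/2, although the gap shrinks as the noise level grows.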

3. Thresholding and percolation on triangular lattices 

Now we are ready to describe one of the main ingredients of our method: the
thresholding. The idea of the thresholding is as follows: in the noisy
greyscale image, we pick those pixels whose observed grey value suggests
that their true colour is black. Then we colour all those pixels black,
irrespective of the exact value of grey that was observed on them. We take
into account the intensity of grey observed at those pixels only once, at the
beginning of our procedure. The idea is to think that a pixel "seems to have a
black colour" when it is not very likely to obtain the observed grey value by
adding a "reasonable" noise to a white pixel.

We colour white all the pixels that weren't coloured black at the previous
step. At the end of this procedure, we have a transformed array of
0's and 1's; call it $\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$. We will be able to
analyse this transformed picture by using certain results from the
mathematical theory of percolation. This is the main goal of the present
paper. But first we have to give more details about the thresholding
procedure.
Let us fix, for each $N$, a real number $\alpha_0(N) > 0$, $\alpha_0(N) \le 1$,
such that there exists $\theta(N) \in \mathbb{R}$ satisfying the following
condition:

$$P_0(Y_{ij} \ge \theta(N)) \le \alpha_0(N). \tag{7}$$

In this paper we will always pick $\alpha_0(N) \equiv \alpha_0$ for all
$N \in \mathbb{N}$, for some constant $\alpha_0 > 0$. But we will need to have
varying $\alpha_0(\cdot)$ in our future research.

As a first step, we transform the observed noisy image
$\{Y_{i,j}\}_{i,j=1}^{N}$ in the following way: for all $1 \le i, j \le N$,

1. If $Y_{ij} \ge \theta(N)$, set $\overline{Y}_{ij} := 1$ (i.e., in the
transformed picture the corresponding pixel is coloured black).

2. If $Y_{ij} < \theta(N)$, set $\overline{Y}_{ij} := 0$ (i.e., in the
transformed picture the corresponding pixel is coloured white).

Definition 1. The above transformation is called thresholding at the level
$\theta(N)$. The resulting array $\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$ of $N^2$
values (0's and 1's) is called a thresholded picture.
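
The thresholding step of Definition 1 can be sketched in a few lines. This is only an illustration: the function name `threshold`, the nested-list image representation and the sample values are assumptions of this example, not part of the paper.

```python
def threshold(image, theta=0.5):
    """Return the thresholded picture: 1 (black) where Y >= theta, else 0 (white)."""
    return [[1 if y >= theta else 0 for y in row] for row in image]

# A small greyscale array with values outside [0, 1], as produced by noise.
noisy = [
    [0.10, 0.70, -0.30],
    [1.20, 0.49, 0.50],
    [0.00, 2.50, -1.00],
]
binary = threshold(noisy)
for row in binary:
    print(row)
```

Only the comparison with the threshold matters: the values 0.50 and 2.50 both become black, and the exact grey level is not used again.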

One can think of pixels from $\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$ as vertices
of a planar graph. Let us colour these $N^2$ vertices with the same colours as
the corresponding pixels. We obtain a graph $G_N$ with $N^2$ black or white
vertices and (so far) no edges.

We add edges to $G_N$ in the following way. If any two black vertices are
"neighbours" (in a way to be specified in the following two paragraphs),
we connect these two vertices with a black edge. If any two white vertices
are neighbours, we connect them with a white edge. We will not add any
edges between non-neighbouring points, and we will not connect vertices of
different colours to each other.

It is crucial how one defines neighbourhoods for vertices of $G_N$: different
definitions can lead to testing procedures with very different properties. The
first and very natural way is to view $G_N$ as an $N \times N$ square subset
of the $\mathbb{Z}^2$ lattice. We have shown that our method works in this
case, see Langovoy and Wittich (2009a).

However, it turns out that the method becomes truly nonparametric and
robust when we view our black and white pixelized picture as a collection
of black and white clusters on a very specific planar graph, namely, on an
$N \times N$ subset of the triangular lattice $\mathbb{T}^2$ (obtained from
the $\mathbb{Z}^2$ lattice by adding diagonals to every square on the
lattice). In the present paper, we will work exclusively with triangular
lattices.
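
A minimal sketch of the resulting neighbourhood structure on an $N \times N$ screen is given below. The function name `triangular_neighbours`, the 0-based indexing and the choice of which of the two diagonals is added to each square are assumptions of this example; either orientation gives a triangular lattice.

```python
# Offsets of the square lattice plus one diagonal per square (assumed orientation).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (1, 1)]

def triangular_neighbours(i, j, n):
    """Neighbours of pixel (i, j) on an n x n subset of the triangular lattice."""
    return [
        (i + di, j + dj)
        for di, dj in OFFSETS
        if 0 <= i + di < n and 0 <= j + dj < n
    ]

print(triangular_neighbours(1, 1, 3))  # an interior pixel has six neighbours
print(triangular_neighbours(0, 0, 3))  # a corner pixel has fewer
```

Interior vertices have six neighbours instead of the four they would have on the square lattice, which is what moves the critical probability of site percolation to 1/2.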

We perform $\theta(N)$-thresholding of the noisy image
$\{Y_{i,j}\}_{i,j=1}^{N}$ using a very special value of $\theta(N)$. Our goal
is to choose $\theta(N)$ (and the corresponding $\alpha_0(N)$, see (7)) such
that:

$$P_0(Y_{ij} \ge \theta(N)) < p_c^{site}, \tag{8}$$
$$p_c^{site} < P_1(Y_{ij} \ge \theta(N)), \tag{9}$$

where $p_c^{site}$ is the critical probability for site percolation on
$\mathbb{T}^2$ (see Grimmett (1999), Kesten (1982)). If both (8) and (9) are
satisfied, what do we get?

Since $Y_{ij}$ is random, we actually observe the so-called site percolation
of black vertices within the subset of $\mathbb{T}^2$. From this point, we can
use results from percolation theory to predict the formation of black and
white clusters on $G_N$, as well as to estimate the number of clusters and
their sizes and shapes. Relations (8) and (9) are crucial here.

To explain this more formally, let us split the set of vertices $V_N$ of the
graph $G_N$ into two groups: $V_N = V_N^{im} \cup V_N^{out}$, where
$V_N^{im} \cap V_N^{out} = \emptyset$, and $V_N^{im}$ consists of those and
only those vertices that correspond to pixels belonging to the original
object, while $V_N^{out}$ is left for the pixels from the background. Denote
by $G_N^{im}$ the subgraph of $G_N$ with vertex set $V_N^{im}$, and by
$G_N^{out}$ the subgraph of $G_N$ with vertex set $V_N^{out}$.

If (8) and (9) are satisfied, we will observe a so-called supercritical
percolation of black clusters on $G_N^{im}$, and a subcritical percolation of
black clusters on $G_N^{out}$. Without going into much detail on percolation
theory (the necessary introduction can be found in Grimmett (1999) or Kesten
(1982)), we mention that there will be a high probability of forming
relatively large black clusters on $G_N^{im}$, but there will be only small
and scarce black clusters on $G_N^{out}$. The difference between the two
regions will be striking, and this is the main component in our image
analysis method.

In this paper, mathematical percolation theory will be used to derive
quantitative results on the behaviour of clusters for both cases. We will
apply those results to build efficient randomized algorithms that will be
able to detect and estimate the object $\{\mathrm{Im}_{i,j}\}_{i,j=1}^{N}$
using the difference in percolation phases on $G_N^{im}$ and $G_N^{out}$.

But when can the key inequalities (8) and (9) be simultaneously satisfied
for an appropriate threshold $\theta$? The following important proposition
shows that, under very mild conditions, our method is asymptotically
consistent for any noise level.
Proposition 2. On the triangular lattice, (8) and (9) are always satisfied
for $\theta = 1/2$.

Proof. (Proposition 2) For the planar triangular lattice one has
$p_c^{site} = 1/2$ (see Kesten (1982)). The statement follows from
Proposition 1. □

Proposition 2 explains the main reason for working with the triangular
lattice: for this lattice, our method is asymptotically consistent for any
noise level, and the natural threshold $\theta(N) = 1/2$ is always
appropriate. As we will see in the following section, this makes our testing
procedure applicable in the case of unknown and nonsmooth nonparametric
noise.



4. Object detection 

We either observe a blank white screen with accidental noise or there is an
actual object in the blurred picture. In this section, we propose an algorithm
to decide which of the two possibilities is true. This algorithm is
a statistical testing procedure. It is designed to solve the problem of
testing $H_0 : \mathrm{Im}_{ij} = 0$ for all $1 \le i, j \le N$ versus
$H_1 : \mathrm{Im}_{ij} = 1$ for some $i, j$.

Let us choose $\alpha(N) \in (0, 1)$, the probability of false detection of an
object. More formally, $\alpha(N)$ is the maximal probability that the
algorithm finishes its work with the decision that there was an object in the
picture, while in fact there was just noise. In statistical terminology,
$\alpha(N)$ is the probability of an error of the first kind.

We allow $\alpha$ to depend on $N$; $\alpha(N)$ is connected with the
complexity (and expected working time) of our randomized algorithm.

Since in our method it is crucial to observe some kind of percolation in the
picture (at least within the image), the image has to be "not too small" in
order to be detectable by the algorithm: one can't observe anything
percolation-like on just a few pixels. We will use percolation theory to
determine how "large" precisely the object has to be in order to be
detectable. Some size assumption has to be present in any detection problem:
for example, it is hopeless to detect a single point object on a very large
screen even in the case of moderate noise.

For an easy start, we make the following (way too strong) largeness
assumptions about the object of interest:

(D1) Assume that the object contains a black square with the side of size
at least $\varphi_{im}(N)$ pixels, where

$$\lim_{N \to \infty} \frac{\log \frac{1}{\alpha(N)}}{\varphi_{im}(N)} = 0. \tag{10}$$

(D2)
$$\lim_{N \to \infty} \frac{\varphi_{im}(N)}{\log N} = \infty. \tag{11}$$

Furthermore, we assume the obvious consistency assumption

$$\varphi_{im}(N) \le N. \tag{12}$$

Assumptions (D1) and (D2) are sufficient conditions for our algorithm to
work. They are way too strong for our purposes. It is possible to relax (11)
and to replace a square in (D1) by a triangle-shaped figure.

Although conditions (10) and (11) are of asymptotic character, most of
the estimates used in our method are valid for finite $N$ as well.

Now we are ready to formulate our Detection Algorithm. Fix the false
detection rate $\alpha(N)$ before running the algorithm.

Algorithm 1 (Detection).

• Step 0. Find an optimal $\theta(N)$ (in our framework $\theta(N) := 1/2$).

• Step 1. Perform $\theta(N)$-thresholding of the noisy picture
$\{Y_{i,j}\}_{i,j=1}^{N}$.

• Step 2. Until
{{a black cluster of size $\varphi_{im}(N)$ is found}
or
{all black clusters are found}},
run depth-first search (Tarjan (1972)) on the graph $G_N$ of
the $\theta(N)$-thresholded picture $\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$.

• Step 3. If a black cluster of size at least $\varphi_{im}(N)$ was found,
report that an object was detected.

• Step 4. If no black cluster of size at least $\varphi_{im}(N)$ was found,
report that there is no object.

At Step 2 our algorithm finds and stores not only the sizes of black clusters,
but also the coordinates of the pixels constituting each cluster. We recall
that $\theta(N)$ is defined as in (7), $G_N$ and
$\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$ were defined in Section 3, and
$\varphi_{im}(N)$ is any function satisfying (10). The depth-first search
algorithm is a standard procedure used for finding connected components of
graphs. This procedure is a deterministic algorithm. A detailed description
and rigorous complexity analysis can be found in Tarjan (1972), or in the
classic book Aho et al. (1975), Chapter 5.
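
The steps of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the function name `detect_object`, the nested-list image format and the diagonal orientation of the triangular lattice are choices made for this example, and the sketch only counts cluster sizes (it stops at the first sufficiently large cluster and does not store pixel coordinates).

```python
# Triangular lattice: square-grid neighbours plus one diagonal (assumed orientation).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (1, 1)]

def detect_object(image, phi_im, theta=0.5):
    """Return True if some black cluster has at least phi_im pixels."""
    n = len(image)
    # Step 1: thresholding at level theta.
    black = [[y >= theta for y in row] for row in image]
    seen = [[False] * n for _ in range(n)]
    # Step 2: iterative depth-first search over black clusters.
    for si in range(n):
        for sj in range(n):
            if not black[si][sj] or seen[si][sj]:
                continue
            seen[si][sj] = True
            stack, size = [(si, sj)], 0
            while stack:
                i, j = stack.pop()
                size += 1
                if size >= phi_im:
                    return True  # Step 3: a large black cluster was found
                for di, dj in OFFSETS:
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < n and black[a][b] and not seen[a][b]:
                        seen[a][b] = True
                        stack.append((a, b))
    return False  # Step 4: all clusters were explored and none was large enough

# Noise-free toy picture: a 3 x 3 black square on a 6 x 6 white screen.
picture = [[1.0 if 1 <= i <= 3 and 1 <= j <= 3 else 0.0 for j in range(6)]
           for i in range(6)]
print(detect_object(picture, phi_im=9))
print(detect_object(picture, phi_im=10))
```

Every pixel is marked as seen at most once and each visit inspects six neighbours, so the running time of this sketch is proportional to the number of pixels.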

Let us prove that Algorithm 1 works, and determine its complexity. 

Theorem 1. Suppose assumptions (A), (D1) and (D2) are satisfied and the
noise is symmetric. Then

1. Algorithm 1 finishes its work in $O(N^2)$ steps, i.e. is linear.

2. If there was an object in the picture, Algorithm 1 detects it with
probability at least $1 - \exp(-C_1(\sigma)\varphi_{im}(N))$.

3. The probability of false detection doesn't exceed
$\min\{\alpha(N), \exp(-C_2(\sigma)\varphi_{im}(N))\}$ for all
$N > N(\sigma)$.

The constants $C_1 > 0$, $C_2 > 0$ and $N(\sigma) \in \mathbb{N}$ depend only
on $\sigma$.

Theorem 1 means that Algorithm 1 is of the quickest possible order: it is
linear in the input size. It is difficult to think of an algorithm working
quicker in this problem. Indeed, if the image is very small and located in an
unknown place on the screen, or if there is no image at all, then any
algorithm solving the detection problem will have to at least upload
information about $O(N^2)$ pixels, i.e. under the general assumptions of
Theorem 1 any detection algorithm will have at least linear complexity.

Another important point is that Algorithm 1 is not only consistent, but 
that it has exponential rate of accuracy. 

It is also interesting to remark here that, although it is assumed that the
object of interest contains a $\varphi_{im}(N) \times \varphi_{im}(N)$ black
square, one cannot use the very natural idea of simply considering sums of
values on all squares of size $\varphi_{im}(N) \times \varphi_{im}(N)$ in
order to detect an object. Nor can some sort of thresholding be avoided in
general. Indeed, although this simple idea works very well for normal noise,
it cannot be used in the case of unknown and possibly irregular or
heavy-tailed noise. For example, for heavy-tailed noise, detection based on
non-thresholded sums of values over subsquares will lead to a high
probability of false detection, whereas the method of the present paper
works just fine.
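
The effect of a single extreme noise value can be seen in a small deterministic sketch. The two toy pictures, the helper names `max_block_sum` and `count_black`, and the outlier value 1000 are assumptions made for this illustration.

```python
def max_block_sum(image, k):
    """Largest sum of observed values over all k x k sub-squares (brute force)."""
    n = len(image)
    return max(
        sum(image[a][b] for a in range(i, i + k) for b in range(j, j + k))
        for i in range(n - k + 1)
        for j in range(n - k + 1)
    )

def count_black(image, theta=0.5):
    """Number of pixels coloured black by thresholding at level theta."""
    return sum(1 for row in image for y in row if y >= theta)

n, k = 8, 3
# No object at all, but one heavy-tailed noise value.
blank_with_outlier = [[0.0] * n for _ in range(n)]
blank_with_outlier[4][4] = 1000.0
# A genuine k x k black square and no noise.
with_object = [[1.0 if i < k and j < k else 0.0 for j in range(n)]
               for i in range(n)]

print(max_block_sum(blank_with_outlier, k), max_block_sum(with_object, k))
print(count_black(blank_with_outlier), count_black(with_object))
```

The raw block sum of the empty picture exceeds that of the picture with a real object, so any rule based on such sums would report a false detection. After thresholding, the outlier contributes one isolated black pixel, while the object contributes a cluster of nine.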

5. Example 

In this section, we outline an example illustrating possible applications of
our method. We start with a real greyscale picture of a neuron (see Fig. 1).
This neuron is an irregular object with unknown shape, and our method can
be very advantageous in situations like this.

Based on this real picture, we perform the following simulation study. We
add Gaussian noise of level $\sigma = 1.8$ independently to each pixel in the
image, and then we run Algorithm 1 on this noisy picture. A typical version
of a noisy picture with this relatively strong noise can be seen in Fig. 2.
We run the algorithm on 1000 simulated pictures. Note that we used Gaussian
noise for illustrative purposes only. We made no use either of the fact
that the noise is normal or of our knowledge of the actual noise level.

As a result, the neuron was detected in 96.8% of all cases. At the same
time, the probability of false detection was shown to be below 5%. Now we
describe our experiment in more detail.

The starting picture (see Fig. 1) was 450 x 450 pixels. White pixels have
value 0 and black pixels have value 1. Some pixels were grey already in the
original picture, but in practice this doesn't spoil the detection procedure.

Fig 1. A part of a real neuron.

We used as a threshold $\theta = 0.5$. The thresholded version of Fig. 2 is
shown in Fig. 3. As follows from Theorem 1, our testing procedure is
asymptotically consistent. We have chosen $\sigma = 1.8$ in our simulation
study. In practice, Algorithm 1 can be consistently used for stronger noise
levels for images of this size.

Suppose the null hypothesis is true, i.e. there is no signal in the original
picture. By running Algorithm 1 on empty pictures of size 450 x 450 with
simulated noise of level $\sigma = 1.8$ and $\theta = 0.5$, one can find that
with probability more than 95% there will be no black cluster of size 304 or
more on the thresholded picture. Therefore, we considered as significant only
those clusters that had more than 304 pixels. A different and much more
efficient way of calculating $\varphi_{im}(N)$ for moderate sizes of $N$ is
proposed in Langovoy and Wittich (2009b).

imsart-generic ver. 2007/04/13 file: Detection_Percolation_Version_4.tex

P. L. Davies, M. Langovoy and O. Wittich / Detection in noisy images and percolation

Fig 2. A noisy picture.



For moderate sample sizes, the algorithm is applicable in many situations
that are not covered by Theorem 1. The object, of course, doesn't have to
contain a square of size 303 x 303 in order to be detectable. In particular,
for noise level $\sigma = 1.8$, even objects containing a 40 x 40 square are
consistently detected. The neuron in Fig. 1 passes this criterion, and
Algorithm 1 detected the neuron 968 times out of 1000 runs.
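
The calibration of the significant cluster size under the null hypothesis, as described in this section, can be imitated by a small Monte Carlo sketch. Everything here is an assumption made for illustration: the function names, the use of Python's `random.Random.gauss` for the Gaussian noise, the diagonal orientation of the lattice, and the small default screen size and number of runs (chosen so that the sketch finishes quickly, unlike the 450 x 450 setting of the paper).

```python
import math
import random

# Triangular lattice: square-grid neighbours plus one diagonal (assumed orientation).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (1, 1)]

def largest_black_cluster(black):
    """Size of the largest connected black cluster in a square boolean grid."""
    n = len(black)
    seen = [[False] * n for _ in range(n)]
    best = 0
    for si in range(n):
        for sj in range(n):
            if not black[si][sj] or seen[si][sj]:
                continue
            seen[si][sj] = True
            stack, size = [(si, sj)], 0
            while stack:  # iterative depth-first search
                i, j = stack.pop()
                size += 1
                for di, dj in OFFSETS:
                    a, b = i + di, j + dj
                    if 0 <= a < n and 0 <= b < n and black[a][b] and not seen[a][b]:
                        seen[a][b] = True
                        stack.append((a, b))
            best = max(best, size)
    return best

def null_cluster_quantile(n, sigma, level=0.95, runs=200, theta=0.5, seed=0):
    """Empirical `level`-quantile of the largest black cluster on empty pictures."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(runs):
        # Empty picture (all pixels white) plus Gaussian noise, then thresholding.
        black = [[sigma * rng.gauss(0.0, 1.0) >= theta for _ in range(n)]
                 for _ in range(n)]
        sizes.append(largest_black_cluster(black))
    sizes.sort()
    index = min(runs - 1, math.ceil(level * runs) - 1)
    return sizes[index]

if __name__ == "__main__":
    print(null_cluster_quantile(n=40, sigma=1.8, runs=100))
```

Clusters larger than the returned value occurred in at most roughly a fraction `1 - level` of the simulated empty pictures, so the value can serve as an empirical significance threshold for that screen size and noise level.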



6. Proofs 

Before proving the main result, we first state the following theorem
about subcritical site percolation on the standard triangular lattice
$\mathbb{T}^2$.


Fig 3. A thresholded picture. 

Theorem 2. Consider site percolation with probability $p_0$ on
$\mathbb{T}^2$. There exists a constant
$\lambda_{site} = \lambda_{site}(p_0) > 0$ such that

$$P_{p_0}(|C| \ge n) \le e^{-n \lambda_{site}(p_0)}, \quad \text{for all } n \ge N(p_0). \tag{13}$$

Here $C$ denotes the open cluster containing the origin.

Proof. (Theorem 2): The triangular lattice satisfies the conditions of
Theorem 5.1 in Kesten (1982), p. 83. Therefore, the second part of that
theorem (see equations (5.12)-(5.14) and the conclusion following them)
ensures that there exist constants $C_1 = C_1(p_0) > 0$,
$C_2 = C_2(p_0) > 0$ such that

$$P_{p_0}(|C| \ge n) \le C_1(p_0)\, e^{-n C_2(p_0)}, \quad \text{for all } n \ge 1. \tag{14}$$

If $C_1 \le 1$, then (13) follows immediately. Otherwise, (13) follows from
(14) for all $n \ge N(p_0)$ and any $\lambda_{site}(p_0) := C_3 = C_3(p_0) > 0$
such that $N(p_0)$ and $C_3$ satisfy the inequality

$$N(p_0)\,(C_2 - C_3) > \log C_1. \tag{15}$$

□



We will also need to use the celebrated FKG inequality (see Fortuin et al.
(1971), or Grimmett (1999), Theorem 2.4, p. 34; see also Grimmett's book
for some explanation of the terminology).

Theorem 3. If $A$ and $B$ are both increasing (or both decreasing) events on
the same measurable pair $(\Omega, \mathcal{F})$, then

$$P(A \cap B) \ge P(A)\,P(B).$$

Define $F_N(n)$ as the event that there is an erroneously marked black
cluster of size greater than or equal to $n$, lying in the square of size
$N \times N$ corresponding to the screen. (An erroneously marked black cluster
is a black cluster on $G_N$ such that each of the pixels in the cluster was
wrongly coloured black after the $\theta$-thresholding.)

Denote

$$p_{out}(N) := P(\, Y_{ij} \ge 1/2 \mid \mathrm{Im}_{ij} = 0 \,), \tag{16}$$

the probability of erroneously marking a white pixel outside of the image as
black.
The next theorem is particularly useful when studying percolation on finite
sublattices of the initial infinite lattice.

Theorem 4. Suppose that $0 < p_{out}(N) < p_c^{site}$. There exists a constant
$C_3 = C_3(p_{out}(N)) > 0$ such that

$$P_{p_{out}(N)}(F_N(n)) \le \exp(-n\, C_3(p_{out}(N))), \quad \text{for all } n \ge \varphi_{im}(N). \tag{17}$$

Proof. (Theorem 4): Denote by $C(i, j)$ the largest black cluster in the
$N \times N$ screen (triangulated by diagonals of one orientation) containing
the pixel with coordinates $(i, j)$, and by $C(0)$ the largest black cluster
on the same $N \times N$ screen containing 0. It doesn't matter for this proof
which point is denoted by 0. By Theorem 2, for all $i, j$:
$1 \le i, j \le N$:

$$P_{p_{out}(N)}(|C(i, j)| \ge n) \le e^{-n \lambda_{site}(p_{out}(N))}. \tag{18}$$

Obviously, restricting our clusters to a finite subset instead of the whole
lattice $\mathbb{T}^2$ only helps inequalities (13) and (18). On a side
note, there is no longer any symmetry between arbitrary points of the
$N \times N$ finite subset of the triangular lattice; luckily, this doesn't
affect the present proof.

Since $\{|C(0)| \ge n\}$ and $\{|C(i, j)| \ge n\}$ are increasing events (on
the measurable pair corresponding to the standard random-graph model on
$G_N$), we have that $\{|C(0)| < n\}$ and $\{|C(i, j)| < n\}$ are decreasing
events for all $i, j$. By the FKG inequality for decreasing events,

$$P_{p_{out}(N)}(\,|C(i, j)| < n \text{ for all } i, j,\ 1 \le i, j \le N\,) \ \ge \prod_{1 \le i, j \le N} P_{p_{out}(N)}(|C(i, j)| < n) \ \ge \ \text{(by (18))} \ \left(1 - e^{-n \lambda_{site}(p_{out})}\right)^{N^2}.$$

It follows that

$$P_{p_{out}(N)}(F_N(n)) = P_{p_{out}(N)}(\exists (i, j),\ 1 \le i, j \le N : |C(i, j)| \ge n)$$
$$\le 1 - \left(1 - e^{-n \lambda_{site}(p_{out})}\right)^{N^2} = 1 - \sum_{k=0}^{N^2} (-1)^k \binom{N^2}{k} e^{-n \lambda_{site}(p_{out})\, k}$$
$$= \sum_{k=1}^{N^2} (-1)^{k-1} \binom{N^2}{k} e^{-n \lambda_{site}(p_{out})\, k} = N^2 e^{-n \lambda_{site}(p_{out})} + o\!\left(N^2 e^{-n \lambda_{site}(p_{out})}\right),$$

because we assumed in (17) that $n \ge \varphi_{im}(N)$, and
$\varphi_{im}(N) \gg \log N$. Moreover, we see immediately that Theorem 4
follows now with some $C_3$ such that
$0 < C_3(p_{out}(N)) < \lambda_{site}(p_{out}(N))$. □
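
To get a feeling for the last estimate, the following sketch compares the bound $1 - (1 - e^{-n\lambda})^{N^2}$ with its leading term $N^2 e^{-n\lambda}$. The function name `false_cluster_bound` and the value $\lambda = 0.1$ are hypothetical; the paper does not give a numerical value of $\lambda_{site}$.

```python
import math

def false_cluster_bound(N, n, lam):
    """Return (exact, leading) = (1 - (1 - e^{-n lam})^{N^2}, N^2 e^{-n lam})."""
    x = math.exp(-n * lam)
    # 1 - (1 - x)^(N^2), computed stably via log1p and expm1.
    exact = -math.expm1(N * N * math.log1p(-x))
    leading = N * N * x
    return exact, leading

for n in (200, 300, 400):
    exact, leading = false_cluster_bound(N=450, n=n, lam=0.1)
    print(f"n={n}: exact={exact:.3e}, leading={leading:.3e}")
```

Once $n$ is a sufficiently large multiple of $\log N$, the factor $N^2$ is dominated by the exponential term and both quantities decay exponentially in $n$.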

Now we establish the following useful lemma. Let $G_N$ denote the
$N \times N$ subset of $\mathbb{T}^2$, as defined in Section 3 of the present
paper, and denote its canonical matching graph by $G_N^*$. We recall that
$\mathbb{T}^2$ is self-matching, and refer to Kesten (1982), Section 2.2 for
the necessary definitions. Assuming that $n \le N$, let $A_n$ be the event
that there is an open (i.e., black) path in the rectangle
$[0, n] \times [0, n]$ joining some vertex on its left side to some vertex on
its right side. Similarly, let $B_n$ denote the event that there exists a
closed (i.e., white) path on $G_n^*$ joining a vertex on the top side of
$G_n^*$ to a vertex on its bottom side.

Remark 1. When speaking about black or white crossings of rectangles, we
are free to assume that $\mathbb{T}^2$ is embedded in the plane as a
$\mathbb{Z}^2$ lattice with diagonals. See Kesten (1982) for a discussion of
connections between percolation and various planar embeddings of regular
lattices.

Lemma 1. Let $0 < p < 1$ be a real number. Consider standard site
percolation with probability $p$ on the triangular lattice. Then

1. Either $A_n$ or $B_n$ occurs. Moreover, $A_n \cap B_n = \emptyset$.

2.
$$P_p(A_n) + P_p(B_n) = 1, \tag{19}$$
$$P_p(A_n) + P_{1-p}(A_n) = 1. \tag{20}$$

Proof. (Lemma 1). Statement 1 of the Lemma follows directly from
Proposition 2.2 of Kesten (1982) (see also pp. 398-402 of that book, where a
rigorous proof of this proposition is presented, including the necessary
topological considerations). Statement 2 is an immediate consequence of
Statement 1 and the definitions of percolation measures on $G_n$ and $G_n^*$.

To complete the proof, note that $G_n$ and $G_n^*$ are isomorphic, by Example
(iii), pp. 19-20 of Kesten (1982). Since by definition a vertex of $G_n^*$ is
black with probability $1 - p$, we have that

$$P_p(B_n) = P_{1-p}(A_n). \tag{21}$$

This proves (20). □
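
Statement 1 of Lemma 1 can be checked by brute force on very small screens. The sketch below is only an illustration under stated assumptions: it uses the square lattice with one added diagonal per square as the embedding of the triangular lattice, it relies on the lattice being self-matching (so white paths use the same adjacency as black ones), the function names are invented for this example, and `n` here counts vertices per side rather than the side length of the rectangle $[0, n] \times [0, n]$.

```python
from itertools import product

# Triangular lattice: square-grid neighbours plus one diagonal (assumed orientation).
OFFSETS = [(-1, 0), (1, 0), (0, -1), (0, 1), (-1, -1), (1, 1)]

def has_crossing(colour, want, left_right):
    """Is there a path of `want`-coloured vertices crossing the n x n grid?

    colour[x][y] is 1 (black) or 0 (white). A left-right crossing joins x = 0
    to x = n - 1; otherwise the crossing joins y = 0 to y = n - 1.
    """
    n = len(colour)
    if left_right:
        starts = [(0, y) for y in range(n) if colour[0][y] == want]
    else:
        starts = [(x, 0) for x in range(n) if colour[x][0] == want]
    seen, stack = set(starts), list(starts)
    while stack:
        x, y = stack.pop()
        if (x if left_right else y) == n - 1:
            return True
        for dx, dy in OFFSETS:
            a, b = x + dx, y + dy
            if 0 <= a < n and 0 <= b < n and colour[a][b] == want and (a, b) not in seen:
                seen.add((a, b))
                stack.append((a, b))
    return False

def check_duality(n):
    """True if, for every colouring, exactly one of the two crossings exists."""
    for bits in product((0, 1), repeat=n * n):
        colour = [list(bits[x * n:(x + 1) * n]) for x in range(n)]
        black_lr = has_crossing(colour, 1, left_right=True)
        white_tb = has_crossing(colour, 0, left_right=False)
        if black_lr == white_tb:
            return False
    return True

print(check_duality(2), check_duality(3))
```

The enumeration covers all $2^{n^2}$ colourings, so it is feasible only for tiny screens; it is a sanity check of the statement, not a substitute for the proof cited from Kesten (1982).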

First we prove the following theorem:

Theorem 5. Consider site percolation on the $\mathbb{T}^2$ lattice with
percolation probability $p > p_c^{site} = 1/2$. Let $A_n$ be the event that
there is an open path in the rectangle $[0, n] \times [0, n]$ joining some
vertex on its left side to some vertex on its right side. Let $M_n$ be the
maximal number of vertex-disjoint open left-right crossings of the rectangle
$[0, n] \times [0, n]$. Then there exist constants $C_4 = C_4(p) > 0$,
$C_5 = C_5(p) > 0$, $C_6 = C_6(p) > 0$ such that

$$P_p(A_n) \ge 1 - (n + 1)\, e^{-C_4 n}, \tag{22}$$
$$P_p(M_n \le C_5 n) \le e^{-C_6 n}, \tag{23}$$

and both inequalities hold for all $n \ge N_1(p)$.

Proof. (Theorem 5): Let $LR_k(n)$, $0 \le k \le n$, be the event that the
point $(0, k)$ of $G_n$ is connected by a white (in other words, closed) path
(that lies in the interior of $G_n$) to some vertex on the right border of
$G_n$. Denote by $LR(n)$ the event that there exists a closed left-right
crossing of $G_n$. Let $C((0, k))$ denote the white cluster containing the
point $(0, k)$, where we make the convention that this cluster is considered
on the whole lattice $\mathbb{T}^2$. Then obviously

$$LR_k(n) \subseteq \{\omega : |C((0, k))| \ge n\} \tag{24}$$

and

$$LR(n) \subseteq \bigcup_{k=0}^{n} LR_k(n). \tag{25}$$

Now (25) gives us

$$P_{1-p}(LR(n)) \le \sum_{k=0}^{n} P_{1-p}(LR_k(n)) \le (n + 1) \max_{0 \le k \le n} P_{1-p}(LR_k(n)). \tag{26}$$

Since $1 - p < p_c^{site}$, we get from (24) and Theorem 2 that for all $k$

$$P_{1-p}(LR_k(n)) \le P_{1-p}(\{|C((0, k))| \ge n\}) \le e^{-C_4 n}. \tag{27}$$

Combining (26) and (27) yields

$$P_{1-p}(A_n) = P_{1-p}(LR(n)) \le (n + 1)\, e^{-C_4 n}. \tag{28}$$

Altogether, (20) and (28) imply (22). This proves the first half of
Theorem 5.

As for the second part of the proof, (23) is deduced from (22) with the
help of Theorem 2.45 of Grimmett (1999). The derivation itself is presented
on pp. 49-50 of Grimmett (1999); the only difference is that in our case one
has to replace "edges" by "vertices" in the proof from the book. Everything
else works the same, since Theorem 2.45 is valid for all Bernoulli product
measures on regular lattices; in particular, Theorem 2.45 applies to site
percolation as well. This completes the proof of Theorem 5. □
Proof. (Theorem 1): I. First we prove the complexity result.

The $\theta(N)$-thresholding gives us $\{\overline{Y}_{i,j}\}_{i,j=1}^{N}$ and
$G_N$ in $O(N^2)$ operations. This finishes the analysis of Step 1.

As for Step 2, it is known (see, for example, Aho et al. (1975), Chapter 5,
or Tarjan (1972)) that the standard depth-first search also finishes its work
in $O(N^2)$ steps. It takes no more than $O(N^2)$ operations to save the
positions of all pixels in all clusters to memory, since one has no more than
$N^2$ positions and clusters. This completes the analysis of Step 2 and shows
that Algorithm 1 is linear in the size of the input data.

II. Now we prove the bound on the probability of false detection. Denote

$$p_{out}(N) := P(\, Y_{ij} \ge 1/2 \mid \mathrm{Im}_{ij} = 0 \,), \tag{29}$$

the probability of erroneously marking a white pixel outside of the image as
black. Under the assumptions of Theorem 1, $p_{out}(N) < p_c^{site}$. The
exponential bound on the probability of false detection follows trivially
from Theorem 4.

III. It remains to prove the lower bound on the probability of true
detection. Suppose that we have an object in the picture that satisfies the
assumptions of Theorem 1. Consider any
$\varphi_{im}(N) \times \varphi_{im}(N)$ square in this image. After
$\theta$-thresholding of the picture by Algorithm 1, we observe on the
selected square a site percolation with probability

$$p_{im}(N) := P(\, Y_{ij} \ge 1/2 \mid \mathrm{Im}_{ij} = 1 \,) > p_c^{site}.$$

Then, by (22) of Theorem 5, there exists $C_4 = C_4(p_{im}(N))$ such that
there will be at least one cluster of size not less than $\varphi_{im}(N)$
(for example, one could take any of the existing left-right crossings as a
part of such a cluster), provided that $\varphi_{im}(N)$ is bigger than a
certain $N_1(p_{im}(N))$; and all that happens with probability at least

$$1 - n\, e^{-C_4 n} > 1 - e^{-C_3 n}, \quad n = \varphi_{im}(N),$$

for some $C_3$: $0 < C_3 < C_4$. Note that one can always weaken the constant
$C_3$ above in such a way that the estimate above starts to hold for all
$n \ge 1$. Theorem 1 is proved.

□